Abstract

All failure detection methods are based on the use of redundancy, that
is, on (possibly dynamic) relations among the measured variables.
Consequently the robustness of the failure detection process depends to a
great degree on the reliability of the redundancy relations given the
inevitable presence of model uncertainties. In this paper we address the
problem of determining redundancy relations which are optimally robust in a
sense which includes the major issues of importance in practical failure
detection and which provides us with a significant amount of intuition
concerning the geometry of robust failure detection.
1. Introduction


In this paper we consider the issue of robust failure detection. In
one way or another all failure detection methods generate signals which tend
to highlight the presence of particular failures if they have actually
occurred. However, if any model uncertainties have effects on the
observables which are at all like those of one or more of the failure modes,
these will also be accentuated. Consequently the problem of robust failure
detection is concerned with generating signals which are maximally sensitive
to some effects (failures) and minimally sensitive to others (model errors).

The initial impetus for our approach to this problem came from the
work reported in [5, 13], which documents the first and to date by far most
successful application and flight testing of a failure detection algorithm
based on advanced methods which use analytic redundancy. The singular
feature of that project was that the dynamics of the aircraft were decomposed
in order to analyze the relative reliability of each individual source of
potentially useful failure detection information.

In [2] we presented the results of our initial attempt to extract the
essence of the method used in [5, 13] in order to develop a general approach
to robust failure detection. As discussed in that reference and in others
(such as [3, 7-9]), all failure detection systems are based on exploiting
analytical redundancy relations or (generalized) parity checks. These are
simply functions of the temporal histories of the measured quantities which
have the property that they are small (ideally zero) when the system is
operating normally. In [2] we present one criterion for measuring the
reliability of a particular redundancy relation and use this to pose an
optimization problem to determine the most reliable relation. In [3, 19] we
present another method which has some computational advantages not found
in the approach described in [2].

In this paper we describe the major results of [2, 3, 19]. In the
next section we review the notion of analytic redundancy for perfectly
known models and provide a geometric interpretation which forms the starting
point for our investigation of robust failure detection. Section 3
addresses the problem of robustness using our geometric ideas, and in that
section we pose and solve a first version of the optimum robust redundancy
problem. In Section 4 we discuss extensions to include three important
issues not included in Section 3: scaling, noise, and the
detection/robustness tradeoff.








2. Redundancy Relations 

Consider the noise-free discrete-time model

$$ x(k+1) = A x(k) + B u(k) \tag{2.1} $$

$$ y(k) = C x(k) \tag{2.2} $$

where x is n-dimensional, u is m-dimensional, y is r-dimensional, and A, B,
and C are perfectly known. A redundancy relation for this model is some
linear combination of present and lagged values of u and y which should be
identically zero if no changes (i.e. failures) occur in (2.1), (2.2). As
discussed in [2, 3, 19], redundancy relations can be specified
mathematically in the following way. The subspace of (p+1)r-dimensional
vectors given by


$$ G \triangleq \left\{ \omega \;\middle|\; \omega' \begin{bmatrix} C \\ CA \\ \vdots \\ CA^p \end{bmatrix} = 0 \right\} \tag{2.3} $$

is called the space of parity or redundancy relations of order p. The reason
for this terminology is the following. Suppose that $\omega \in G$. Then
(2.1)-(2.3) imply that if we partition $\omega$ into (p+1) subvectors of
dimension r

$$ \omega' = [\omega_0', \ldots, \omega_p'] \tag{2.4} $$

then at any time k

$$ r(k) = \sum_{i=0}^{p} \omega_i' \left[ y(k-p+i) - \sum_{j=0}^{i-1} C A^{i-j-1} B u(k-p+j) \right] = 0 \tag{2.5} $$

The quantity r(k) is called a parity check. A simpler form for (2.5)
(which we will use later) can be written in the case when u = 0 (or,
equivalently, if the effects of the inputs are subtracted from the
observations before computing the parity check). In this case

$$ r(k) = \omega' \begin{bmatrix} y(k-p) \\ y(k-p+1) \\ \vdots \\ y(k) \end{bmatrix} \tag{2.6} $$


To continue our development, let us assume that

$$ \omega_p \neq 0 \tag{2.7} $$

Let us denote the components of $\omega_i$ as

$$ \omega_i' = [\omega_{i1}, \ldots, \omega_{ir}] \tag{2.8} $$

Since at least one element of $\omega_p$ is nonzero, we can normalize
$\omega$ so this component has unity value. In order to illustrate several
points, let us assume that the first component, $\omega_{p1} = 1$. In this
case (2.5) can be rewritten as

$$ y_1(k) = -\sum_{i=0}^{p-1} \omega_{i1}\, y_1(k-p+i) - \sum_{i=0}^{p} \sum_{s=2}^{r} \omega_{is}\, y_s(k-p+i) + \sum_{i=0}^{p} \sum_{j=0}^{i-1} \omega_i' C A^{i-j-1} B u(k-p+j) \tag{2.9} $$

There are two very important interpretations of (2.9). The most
obvious is that the right-hand side of this equation represents a synthetic
measurement which can be directly compared to $y_1(k)$ in a simple comparison
test. The second interpretation of (2.9) is as a reduced-order dynamic
model. Specifically this equation is nothing but an autoregressive-moving
average (ARMA) model for $y_1(k)$. (From the point of view of the evolution
of $y_1$ according to (2.9), $y_2, \ldots, y_r$ and the components of u are
all regarded as inputs.) This second interpretation allows us to make contact
with the numerous existing failure detection methods. Typically such methods
are based on a noisy version of the model (2.1), (2.2) representing normal
system behavior together with a set of deviations from this model
representing the several failure modes. Rather than applying such methods
to a single, all-encompassing model as in (2.1), (2.2), one could
alternatively apply the same techniques to individual models as in (2.9) (or
a combination of several of these), thereby isolating individual (or specific
groups of) parity relations. For example, this is precisely what was done
in [5, 13]. The advantage of such an approach is that it allows one to
separate the information provided by redundancy relations of differing
levels of reliability, something that is not easily done when one starts
with the overall model (2.1), (2.2) which combines all redundancy relations.

In the next two sections we address the main problem of this paper,
which is the determination of optimally robust redundancy relations. The
key to this approach is the observation that G in (2.3) is the orthogonal
complement of the range Z of the matrix

$$ \begin{bmatrix} C \\ CA \\ \vdots \\ CA^p \end{bmatrix} \tag{2.10} $$

Thus (assuming u = 0 or that the effect of u is subtracted from the
observations) a complete set of independent parity relations of order p is
given by the orthogonal projection of the window of observations y(k),
y(k-1), ..., y(k-p) onto G.
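The geometric statement above can be checked numerically. The following is a minimal sketch, assuming NumPy and an illustrative two-state, three-sensor model with u = 0; the matrices, the helper names `observation_matrix` and `parity_space`, and the rank tolerance are assumptions made for this example and do not come from the paper. It builds the matrix (2.10), takes an orthonormal basis of the orthogonal complement of its range as G, and evaluates the parity checks (2.6) on a simulated window.

```python
import numpy as np

def observation_matrix(A, C, p):
    """Stack C, CA, ..., CA^p as in (2.10)."""
    blocks, Ai = [], np.eye(A.shape[0])
    for _ in range(p + 1):
        blocks.append(C @ Ai)
        Ai = A @ Ai
    return np.vstack(blocks)

def parity_space(A, C, p, tol=1e-10):
    """Orthonormal basis (as columns) of G, the orthogonal complement of Range(2.10)."""
    O = observation_matrix(A, C, p)
    U, sv, _ = np.linalg.svd(O, full_matrices=True)
    rank = int(np.sum(sv > tol * sv.max()))
    return U[:, rank:]          # left singular vectors outside the range of O

# Hypothetical model: n = 2 states, r = 3 sensors, window order p = 1.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
C = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
p = 1
G = parity_space(A, C, p)

# Simulate the unfailed system and form the window [y(k-p); ...; y(k)].
x = np.array([1.0, -2.0])
ys = []
for _ in range(p + 1):
    ys.append(C @ x)
    x = A @ x
window = np.concatenate(ys)
residual = G.T @ window                 # parity checks, zero when no failure

# A bias on the third sensor at time k is an illustrative failure.
failed_window = window.copy()
failed_window[-1] += 1.0
residual_failed = G.T @ failed_window   # no longer zero
```

With a perfectly known model the residual vanishes for any initial state, while the biased window leaves a nonzero component in G.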



3. An Angular Measure of Robustness 
Consider a model containing imperfectly known parameters $\eta$, process
noise w and measurement noise v:

$$ x(k+1) = A(\eta)x(k) + B(\eta)u(k) + w(k) \tag{3.1} $$

$$ y(k) = C(\eta)x(k) + v(k) \tag{3.2} $$

where $\eta$ is a vector of unknown parameters and where the matrices A, B, C
and the covariances of w and v are functions of $\eta$. Let K denote the set
of possible values which $\eta$ can take on. In their work [2] Chow and
Willsky used the following line of reasoning. If the parameters of the system
were known perfectly and if there were no process or measurement noises, then
according to (2.5) we could find a vector
$\omega' = [\omega_0', \ldots, \omega_p']$ and a vector
$\mu' = [\mu_0', \ldots, \mu_{p-1}']$ with

$$ \mu_i' = \sum_{j=i+1}^{p} \omega_j' C A^{j-i-1} B \tag{3.3} $$

so that

$$ r(k) = \sum_{i=0}^{p} \omega_i' y(k-p+i) - \sum_{i=0}^{p-1} \mu_i' u(k-p+i) = 0 \tag{3.4} $$

In the uncertain case, what would seem to make sense is to minimize some
measure of the size of r(k). For example one could consider choosing $\omega$
and $\mu$ that solve the minimax problem

$$ \min_{\omega,\,\mu,\ \|\omega\| = 1} \ \max_{\eta \in K} \ E_{x_0(\eta)}\{[r(k)]^2\} \tag{3.5} $$

Here the expectation is taken for each value of $\eta$ and assuming that the
system is at a particular operating point, i.e. that $u(k) = u_0$ and that
$x_0(\eta)$ is the corresponding set point value of the state. This criterion
has the interpretation of finding the approximate parity relation which, at
the specified operating point, produces the residual with the smallest
worst-case mean-square value when no failure has occurred.

Let us make several comments concerning the procedure just described.
In the first place the optimization problem (3.5) is a complex nonlinear
programming problem. Furthermore, the method does not easily give a sequence
of parity relations ordered by their robustness. Finally the optimum parity
relation clearly depends upon the operating point as specified by $u_0$ and
$x_0(\eta)$. In some problems this may be desirable as it does allow one to
adapt the failure detection algorithm to changing conditions, but in others
it might be acceptable or preferable to have a single set of parity
relations for all operating conditions. The approach developed in this paper
produces such a set and results in a far simpler computational procedure.

To begin, let us focus on (3.1), (3.2) with u = w = v = 0. Referring
to the previous discussion, we note that it is in general impossible to
find parity checks which are perfect for all possible values of $\eta$. That
is, in general we cannot find a subspace G which is orthogonal to

$$ Z(\eta) = \mathrm{Range} \begin{bmatrix} C(\eta) \\ C(\eta)A(\eta) \\ \vdots \\ C(\eta)A(\eta)^p \end{bmatrix} \tag{3.6} $$

for all $\eta$.

What would seem to make sense in this case is to choose a subspace G
which is "as orthogonal as possible" to all possible $Z(\eta)$. Several
possible ways in which this can be done are described in detail in [3]. In
this paper we focus on the one approach which leads to the most complete
picture of robust redundancy and which is computationally the simplest. To do
this, however, we must make the assumption that K, the set of possible values
of $\eta$, is finite. Typically what this would involve is choosing
representative points out of the actual, continuous range of parameter
values. Here "representative" means spanning the range of possible values and
having density variations reflecting any desired weightings on the likelihood
or importance of particular sets of parameter values. However this is
accomplished, we will assume for the remainder of this paper that $\eta$
takes on a discrete set of values $\eta = 1, \ldots, L$, and will use the
notation $A_i$ for $A(\eta = i)$, $Z_i$ for $Z(\eta = i)$, etc.

To obtain a simple computational procedure for determining robust
redundancy relations we first compute an average observation subspace $Z_0$
which is as close as possible to all of the $Z_i$, and we then choose G to be
the orthogonal complement of $Z_0$. To be more precise, note first that the
$Z_i$ are subspaces of possibly differing dimensions (dim $Z_i = \nu_i$)
embedded in a space of dimension N = (p+1)r. We will find it convenient to
use the same symbols $Z_1, \ldots, Z_L$ to denote matrices of sizes
$N \times \nu_i$, $i = 1, \ldots, L$, whose columns form orthonormal bases
for the corresponding subspaces. Letting $M = \nu_1 + \cdots + \nu_L$, we
define the $N \times M$ matrix

$$ Z = [\, Z_1 : \cdots : Z_L \,] \tag{3.7} $$

Thus the columns of Z span the possible directions in which observation
histories may lie under normal conditions.

We now suppose that we wish to determine the s best parity checks (so
that dim G = s). Thus we wish to determine a subspace $Z_0$ of dimension N-s.
The optimum choice for this subspace is taken to be the span of the (not
necessarily orthogonal) columns of the matrix $Z_0$ which minimizes

$$ \| Z - Z_0 \|_F^2 \tag{3.8} $$

subject to the constraint that rank $Z_0$ = N-s. Here $\| \cdot \|_F$ denotes
the Frobenius norm:

$$ \| D \|_F^2 = \sum_{i} \sum_{j} |d_{ij}|^2 \tag{3.9} $$
There are several important reasons for choosing this criterion, one
being that it does produce a space which is as close as possible to a
specified set of directions. A second is that the resulting optimization
problem is easy to solve. In particular, let the singular value
decomposition of Z [14, 15] be given by

$$ Z = U \Sigma V \tag{3.10} $$

where U and V are orthogonal matrices, and

$$ \Sigma = \begin{bmatrix} \mathrm{diag}(\sigma_1, \ldots, \sigma_N) & 0 \end{bmatrix} \tag{3.11} $$

Here $0 \le \sigma_1 \le \cdots \le \sigma_N$ are the singular values of Z
ordered by magnitude. Note we have assumed $N \le M$. If this is not the case
we can make it so without changing the optimum choice of $Z_0$ by padding Z
with additional columns of zeros. It is readily shown [17, 18] that the
matrix $Z_0$ minimizing (3.8) is given by

$$ Z_0 = U \begin{bmatrix} \mathrm{diag}(0, \ldots, 0, \sigma_{s+1}, \ldots, \sigma_N) & 0 \end{bmatrix} V \tag{3.12} $$

Moreover, since the columns of U are orthonormal, we immediately see that
the orthogonal complement of the range of $Z_0$ is given by the first s left
singular vectors of Z, i.e. the first s columns of U. Consequently

$$ G = [\, u_1 : \cdots : u_s \,] \tag{3.13} $$

and $u_1, \ldots, u_s$ are the optimum redundancy relations.
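The procedure of this section can be sketched in a few lines. The code below is an illustration only, assuming NumPy and a hypothetical family of three models that differ in one entry of A; the helper names and the numerical values are assumptions, not taken from the paper. It forms orthonormal bases $Z_i$, stacks them as in (3.7), and selects the left singular vectors with the smallest singular values. NumPy returns singular values in decreasing order, so they are re-sorted to match the ascending convention of (3.11), with zeros appended when N > M.

```python
import numpy as np

def observation_matrix(A, C, p):
    """Stack C, CA, ..., CA^p for one parameter value, as in (3.6)."""
    blocks, Ai = [], np.eye(A.shape[0])
    for _ in range(p + 1):
        blocks.append(C @ Ai)
        Ai = A @ Ai
    return np.vstack(blocks)

def orthonormal_basis(O, tol=1e-10):
    """Orthonormal basis Z_i for the range of O."""
    U, sv, _ = np.linalg.svd(O, full_matrices=False)
    return U[:, sv > tol * sv.max()]

def robust_parity(Z_list, s):
    """Return G (N x s) and the singular values of Z sorted in ascending order."""
    Z = np.hstack(Z_list)                       # (3.7)
    N = Z.shape[0]
    U, sv, _ = np.linalg.svd(Z, full_matrices=True)
    sigma = np.zeros(N)
    sigma[:sv.size] = sv                        # pad with zeros if N > M
    order = np.argsort(sigma, kind="stable")    # ascending, as in (3.11)
    return U[:, order[:s]], sigma[order]

# Hypothetical uncertainty: the (2,2) entry of A takes one of three values.
C = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
models = [np.array([[0.9, 0.1], [0.0, a22]]) for a22 in (0.7, 0.8, 0.9)]
Z_list = [orthonormal_basis(observation_matrix(A, C, 1)) for A in models]

G, sigma = robust_parity(Z_list, s=4)
# Sum over models of ||Z_i' G||_F^2, a direct measure of non-orthogonality.
J = sum(np.linalg.norm(Zi.T @ G, "fro") ** 2 for Zi in Z_list)
```

In this toy case the three observation subspaces together span only three of the six dimensions, so the first three parity relations are exact for every model and only the fourth has a nonzero singular value.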

There is an alternative interpretation of this choice of G which
provides some very useful insight. Specifically, recall that what we wish to
do is to find a G whose columns are as orthogonal as possible to the columns
of the $Z_i$; that is, we would like to choose G to make each of the matrices
$Z_i'G$ as close to zero as possible. In fact, as shown in [3], the choice of
G given in (3.13) minimizes

$$ J(s) = \sum_{i=1}^{L} \| Z_i' G \|_F^2 \tag{3.14} $$

yielding the minimum value

$$ J^*(s) = \sum_{i=1}^{s} \sigma_i^2 \tag{3.15} $$

There are two important points to observe about the result (3.14),
(3.15). The first is that we can now see a straightforward way in which to
include unequal weightings on each of the terms in (3.14). Specifically,
if the $w_i$ are positive numbers, then

$$ \sum_{i=1}^{L} w_i \| Z_i' G \|_F^2 = \sum_{i=1}^{L} \| \sqrt{w_i}\, Z_i' G \|_F^2 \tag{3.16} $$

so that minimizing this quantity is accomplished using the same procedure
described previously but with $Z_i$ replaced by $\sqrt{w_i}\, Z_i$. As a
second point note that the optimum value (3.15) provides us with an
interpretation of the singular values as measures of robustness and with an
ordered sequence of parity relations from most to least robust.
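The step from (3.14) to (3.15) can be made explicit with a short worked derivation. This is a sketch of the standard argument rather than the proof given in [3], and it assumes that G has orthonormal columns ($G'G = I$) and uses the decomposition (3.10) with V orthogonal.

```latex
\begin{aligned}
J(s) &= \sum_{i=1}^{L} \| Z_i' G \|_F^2
      = \| Z' G \|_F^2
      = \operatorname{tr}\left( G' Z Z' G \right) \\
     &= \operatorname{tr}\left( G' U \Sigma \Sigma' U' G \right)
      = \sum_{j=1}^{N} \sigma_j^2 \, \| G' u_j \|^2 ,
\qquad
0 \le \| G' u_j \|^2 \le 1 , \quad
\sum_{j=1}^{N} \| G' u_j \|^2 = s .
\end{aligned}
```

Since the weights $\| G' u_j \|^2$ lie between 0 and 1 and sum to s, the sum is smallest when all of the weight sits on the s smallest $\sigma_j^2$, which is achieved by $G = [\, u_1 : \cdots : u_s \,]$ and gives $\sigma_1^2 + \cdots + \sigma_s^2$.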


4. Several Important Extensions

In this section we address several of the drawbacks and limitations of
the result of the preceding section and obtain modifications to this result
which overcome them at no fundamental increase in complexity.

4.1 Scaling

A critical problem with the method used in the preceding section is that
all vectors in the observation spaces $Z_i$ are treated as being equally
likely to occur. If there are differences in scale among the system variables
this may lead to poor solutions for the optimum parity relations. To overcome
this drawback we proceed as follows. Suppose that we are given a scaling
matrix P so that with the change of basis

$$ \xi = P x \tag{4.1} $$

one obtains a variable $\xi$ which is equally likely to lie in any direction.
For example if covariance analysis has been performed on x and its covariance
is Q, then P can be chosen to satisfy

$$ Q = P^{-1} (P')^{-1} \tag{4.2} $$

and the resulting covariance of $\xi$ is the identity.
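One way to obtain a matrix P satisfying (4.2) is through a Cholesky factorization. The sketch below assumes NumPy and a positive definite Q; the numerical covariance is made up for illustration, and the Cholesky-based choice is only one of many valid P, not necessarily the one the authors used.

```python
import numpy as np

def scaling_matrix(Q):
    """Return P with Q = P^{-1} (P')^{-1}, assuming Q is positive definite."""
    L = np.linalg.cholesky(Q)      # Q = L L' with L lower triangular
    return np.linalg.inv(L)        # P = L^{-1}, so P^{-1} = L

# Hypothetical state covariance with very different scales in each state.
Q = np.array([[4.0, 1.0], [1.0, 9.0]])
P = scaling_matrix(Q)

cov_xi = P @ Q @ P.T               # covariance of xi = P x
Q_rebuilt = np.linalg.inv(P) @ np.linalg.inv(P.T)
```

Here `cov_xi` is the identity up to rounding, which is the property that makes all directions of $\xi$ comparable.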

As a next step, recall that what we would ideally like to do is to choose
a matrix G so that

$$ G' \begin{bmatrix} C_i \\ C_i A_i \\ \vdots \\ C_i A_i^p \end{bmatrix} x = G' \begin{bmatrix} C_i P^{-1} \\ C_i A_i P^{-1} \\ \vdots \\ C_i A_i^p P^{-1} \end{bmatrix} \xi \triangleq G' \bar{C}_i \xi \tag{4.3} $$

is as small as possible. In the preceding section we considered all
directions in $Z_i = \mathrm{Range}(\bar{C}_i)$ to be on equal footing and
arrived at the criterion (3.14). Since all directions for $\xi$ are on equal
footing, we are led naturally to the following criterion which takes scaling
into account:

$$ J(s) = \sum_{i=1}^{L} \| \bar{C}_i' G \|_F^2 \tag{4.4} $$

Using the result [17] cited in the previous section we see that to
find the $N \times s$ matrix G (with orthonormal columns) which minimizes
J(s) we must perform a singular value decomposition of the matrix

$$ \bar{C} = [\, \bar{C}_1 : \cdots : \bar{C}_L \,] = U \Sigma V \tag{4.5} $$

where $\sigma_1^2 \le \sigma_2^2 \le \cdots \le \sigma_N^2$ and
$U = [\, u_1 : \cdots : u_N \,]$. Then $u_1$ is the best parity relation with
$\sigma_1^2$ as its measure of robustness, $u_2$ is the next best, etc.,
and $J^*(s)$ is given by (3.15). Finally, in anticipation of the next
subsection, suppose that we use the stochastic interpretation of $\xi$, i.e.
that

$$ E[\xi \xi'] = I \tag{4.6} $$

In this case if we define the parity check vector

$$ \mu_i = G' \bar{C}_i \xi \tag{4.7} $$

then

$$ E[\| \mu_i \|^2] = \| \bar{C}_i' G \|_F^2 \tag{4.8} $$

4.2 Observation and Process Noise 

In addition to choosing parity relations which are maximally insensitive
to model uncertainties it is also important to choose relations which
suppress noise. Consider then the model

$$ x(k+1) = A_i x(k) + D_i w(k) \tag{4.9} $$

$$ y(k) = C_i x(k) + v(k) \tag{4.10} $$

where w and v are independent, zero-mean white noise processes with
covariances Q and R, respectively. Let

$$ \mu(k) = G' \begin{bmatrix} y(k) \\ \vdots \\ y(k+p) \end{bmatrix} \tag{4.10} $$

Then using the interpretation provided in (4.7), we obtain the following
natural generalization of the criterion (4.4):

$$ J(s) = \sum_{i=1}^{L} E_i[\| \mu(k) \|^2] \tag{4.11} $$

where $E_i$ denotes expectation assuming that the ith model is correct.
Assuming that $\xi(k) = Px(k)$ has the identity as its covariance, using the
whiteness of w and v, and performing some algebraic manipulations we obtain
[3]

$$ J(s) = \sum_{i=1}^{L} \| \bar{C}_i' G \|_F^2 + \| S' G \|_F^2 \tag{4.12} $$

where S is defined by the following:

$$ \bar{D}_i = \begin{bmatrix} 0 & 0 & \cdots & 0 \\ C_i D_i & 0 & \cdots & 0 \\ C_i A_i D_i & C_i D_i & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ C_i A_i^{p-1} D_i & C_i A_i^{p-2} D_i & \cdots & C_i D_i \end{bmatrix} \tag{4.13} $$

$$ \bar{Q} = \mathrm{diag}(Q, \ldots, Q) \ \text{(p times)}, \qquad \bar{R} = \mathrm{diag}(R, \ldots, R) \ \text{((p+1) times)} \tag{4.14} $$

$$ N = \sum_{i=1}^{L} \left( \bar{D}_i \bar{Q} \bar{D}_i' + \bar{R} \right) = S S' \tag{4.15} $$

From (4.12) we see that the effect of the noise is to specify another
set of directions, namely the columns of S, to which we would like to make
the columns of G as close to orthogonal as possible. From this it is evident
that the optimum choice of G is computed by performing a singular value
decomposition on the matrix

$$ [\, \bar{C}_1 : \cdots : \bar{C}_L : S \,] \tag{4.16} $$

As before (4.16) provides a complete set of parity relations ordered in terms
of their degrees of insensitivity to model errors and noise.

4.3 Detection Versus Robustness 

The methods described to this point involve measuring the quality of
redundancy relations in terms of how small the resulting parity checks are
under normal operating conditions. However, in some cases one might prefer
to use an alternative viewpoint. In particular there may be parity checks
which are not optimally robust in the senses we have discussed but are still
of significant value because they are extremely sensitive to particular
failure modes. In this subsection we consider a criterion which takes
such a possibility into account. For simplicity we focus on the noise-free
case. The extension to include noise as in the previous subsection is
straightforward.

The specific problem we consider is the choice of parity checks for the
robust detection of a particular failure mode. We assume that the unfailed
model of the system is

$$ x(k+1) = A_u(\eta) x(k) \tag{4.17} $$

$$ y(k) = C_u(\eta) x(k) \tag{4.18} $$

while if the failure has occurred the model is

$$ x(k+1) = A_f(\eta) x(k) \tag{4.19} $$

$$ y(k) = C_f(\eta) x(k) \tag{4.20} $$

In this case one would like to choose G to be "as orthogonal as possible" to
$Z_u(\eta)$ and "as parallel as possible" to $Z_f(\eta)$.
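Assume again that $\eta$ takes on one of a finite set of possible values,
and let $\bar{C}_{ui}$ and $\bar{C}_{fi}$ denote the counterparts of
$\bar{C}_i$ in (4.3) for the unfailed and failed models, respectively. A
natural criterion which reflects our objective is

$$ J(s) = \min_{G'G = I} \sum_{i=1}^{L} \left\{ \| \bar{C}_{ui}' G \|_F^2 - \| \bar{C}_{fi}' G \|_F^2 \right\} \tag{4.21} $$

If we define the matrix

$$ H = [\, \underbrace{\bar{C}_{u1} : \bar{C}_{u2} : \cdots : \bar{C}_{uL}}_{M_1 \text{ columns}} : \underbrace{\bar{C}_{f1} : \bar{C}_{f2} : \cdots : \bar{C}_{fL}}_{M_2 \text{ columns}} \,] \tag{4.22} $$

then

$$ J(s) = \min_{G'G = I} \operatorname{tr}\{ G' H S H' G \} \tag{4.23} $$

where

$$ S = \begin{bmatrix} I_{M_1} & 0 \\ 0 & -I_{M_2} \end{bmatrix} \tag{4.24} $$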
It is straightforward (see [3]) to show that a minor modification of the
result in [17] leads to the following solution. We perform an
eigenvector-eigenvalue analysis on the matrix

$$ H S H' = U \Lambda U' \tag{4.25} $$

where $U'U = I$ and

$$ \Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_N) \tag{4.26} $$

with $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_N$ and
$U = [\, u_1 : \cdots : u_N \,]$. Then the optimum choice of G is

$$ G = [\, u_1 : \cdots : u_s \,] \tag{4.27} $$

and the corresponding value of (4.23) is

$$ J^*(s) = \sum_{i=1}^{s} \lambda_i \tag{4.28} $$


Let us make two comments about this solution. The first is that as many
as $M_2$ of the $\lambda_i$ can be negative. In fact the parity check based
on $u_i$ is likely to have larger values under failed rather than unfailed
conditions if and only if $\lambda_i < 0$. Thus we immediately see that the
maximum number of useful parity relations for detecting this particular
failure mode equals the number of negative eigenvalues of HSH'. As a second
comment, let us contrast the procedure we use here with a singular value
decomposition, which corresponds essentially to performing an
eigenvector-eigenvalue analysis of HH'. First, assume that the first K of the
$\lambda_i$ are negative. Then, define

$$ \sigma_1^2 = -\lambda_1, \ \sigma_2^2 = -\lambda_2, \ \ldots, \ \sigma_K^2 = -\lambda_K, \qquad \sigma_{K+1}^2 = \lambda_{K+1}, \ \ldots, \ \sigma_N^2 = \lambda_N \tag{4.29} $$

From (4.25) we have that

$$ H S H' = U \Sigma S \Sigma U' \tag{4.30} $$

where

$$ \Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_N) \tag{4.31} $$

Assuming that $\Sigma$ is nonsingular, define

$$ V = \Sigma^{-1} U' H \tag{4.32} $$

Then (4.31), (4.32) imply that V is S-orthogonal

$$ V S V' = S \tag{4.33} $$

and that H has what we call an S-singular value decomposition

$$ H = U \Sigma V \tag{4.34} $$
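The eigenvector-eigenvalue analysis of HSH' described in this subsection can be illustrated on a toy case. The sketch below assumes NumPy; the function name `detection_parity` and the small unfailed and failed matrices are hypothetical stand-ins for the $\bar{C}_{ui}$ and $\bar{C}_{fi}$, chosen so that the answer can be read off by inspection. It relies on `numpy.linalg.eigh` returning eigenvalues in ascending order, which matches the ordering assumed in (4.26).

```python
import numpy as np

def detection_parity(Cu_list, Cf_list, s):
    """Return G (first s eigenvectors of H S H') and all eigenvalues, ascending."""
    Hu, Hf = np.hstack(Cu_list), np.hstack(Cf_list)
    H = np.hstack([Hu, Hf])                              # unfailed block, then failed block
    S = np.diag(np.concatenate([np.ones(Hu.shape[1]),    # +1 on unfailed columns
                                -np.ones(Hf.shape[1])])) # -1 on failed columns
    lam, U = np.linalg.eigh(H @ S @ H.T)                 # symmetric matrix, ascending eigenvalues
    return U[:, :s], lam, H, S

# Toy data (N = 3): unfailed histories lie in span{e1, e2}, failed ones in span{e1, e3}.
Cu = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
Cf = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]])

G, lam, H, S = detection_parity([Cu], [Cf], s=1)
J_star = np.trace(G.T @ H @ S @ H.T @ G)   # value of the criterion at the optimum
n_useful = int(np.sum(lam < 0))            # number of negative eigenvalues
```

In this example HSH' is diagonal with entries 0, 1 and -1, so there is exactly one negative eigenvalue and the selected parity relation is the direction $e_3$ (up to sign): it is orthogonal to every unfailed history and aligned with part of the failed subspace.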